Using Hadoop to Identify False Positives in Bacterial Strain Typing from DNA Fingerprints
Abstract
Pyroprinting is a novel technique used by the Department of Biological Sciences to obtain “fingerprints” from the DNA of E. coli isolates in order to categorize them into strains. To determine how many false positives the pyroprinting process produces, isolates with the same pyroprint needed to be sequenced to see whether their underlying alleles match. If the alleles match, the isolates are indeed the same strain and the result is a true positive. If the alleles do not match, the isolates are different strains and the result is a false positive. To this end, 100 isolates with nucleotide identifiers were sequenced. More than five million sequences were then analyzed using a program implemented on Hadoop. The program provided a general indicator of the efficacy of pyroprinting by grouping the sequences into their respective isolate buckets and analyzing each bucket to determine which matches were false positives. The Hadoop implementation proved reliable and highly scalable. This method of analysis is applicable to many areas within bioinformatics and has potential uses in other industries. The results of the experiment are still being analyzed to determine the frequency of false positives and how that frequency can inform the use of pyroprinting.
Keywords: Hadoop, Distributed Computing, Pyroprinting, Bacterial Strain Typing, E. coli, DNA
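The grouping and comparison steps described in the abstract can be pictured with a small in-memory sketch written in the map/shuffle/reduce style. It is not the authors' Hadoop program. The function names, the (isolate ID, sequence) read format, the pyroprint_group lookup, and the rule of summarizing each isolate by its most common sequence are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations


def map_read(read):
    """Map step (hypothetical): emit (isolate_id, sequence) for one read."""
    isolate_id, sequence = read
    return isolate_id, sequence


def reduce_isolate(sequences):
    """Reduce step (assumed rule): summarize a bucket by its most common sequence."""
    return Counter(sequences).most_common(1)[0][0]


def classify_pairs(reads, pyroprint_group):
    """Label every pair of isolates that share a pyroprint as a true or false positive."""
    # Shuffle: collect the mapped reads into one bucket per isolate.
    buckets = defaultdict(list)
    for read in reads:
        isolate_id, sequence = map_read(read)
        buckets[isolate_id].append(sequence)

    # Reduce: one representative allele per isolate.
    alleles = {iso: reduce_isolate(seqs) for iso, seqs in buckets.items()}

    # Gather the isolates that pyroprinting placed in the same group.
    by_group = defaultdict(list)
    for iso in sorted(alleles):
        by_group[pyroprint_group[iso]].append(iso)

    # Same pyroprint with matching alleles is a true positive, otherwise a false positive.
    results = {}
    for members in by_group.values():
        for a, b in combinations(members, 2):
            same = alleles[a] == alleles[b]
            results[(a, b)] = "true positive" if same else "false positive"
    return results
```

On a real cluster the map and reduce functions would run as separate distributed tasks keyed by isolate identifier, which is what lets this kind of grouping scale to millions of sequences.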
Related Resources
Molecular Methods for Bacterial Strain Typing
ABSTRACT Typing of bacteria is an important part of epidemiological studies on nosocomial infections. Bacterial identification methods have dramatically improved in recent years, which is mainly due to advancements in the field of molecular biotechnology. In many cases, molecular techniques have replaced phenotypic typing methods. Currently, a wide r...
Typing of Toxigenic Isolates of Clostridium perfringens by Multiplex PCR in Ostrich
Clostridium perfringens is an important pathogen that causes numerous different diseases. This bacterium is classified into five types, each of which is capable of causing a distinct disease. There are various methods for bacterial identification, but many are labor-intensive, time-consuming, and expensive, and also show low sensitivity and specificity. The aim of this research was to identif...
An Efficient Strategy for Broad-Range Detection of Low Abundance Bacteria without DNA Decontamination of PCR Reagents
BACKGROUND Bacterial DNA contamination in PCR reagents has been a long standing problem that hampers the adoption of broad-range PCR in clinical and applied microbiology, particularly in detection of low abundance bacteria. Although several DNA decontamination protocols have been reported, they all suffer from compromised PCR efficiency or detection limits. To date, no satisfactory solution has...
PCR Typing of Trichophyton Rubrum Isolates by Specific Amplification of Subrepeat Elements in Ribosomal DNA Nontranscribed Spacer
Background: Trichophyton rubrum (T. rubrum) is the most common cause of dermatophytosis of skin and nail tissue. Strain identification in Trichophyton rubrum is important for identification of strain-related differences in infectivity potential or transmissibility and epidemiological studies. PCR typing could determine whether the original isolate is responsible for re-infection or a new strain...
Molecular identification and capsular typing of Pasteurella multocida isolates from sheep pneumonia in Iran
Pasteurella multocida is known as one of the main organisms causing pneumonia in sheep. As immunity in pasteurellosis is serogroup specific, identification of the prevalent capsular group in endemic areas is essential. The aim of this study was to molecularly identify and determine the capsular type of the P. multocida strains isolated from sheep pneumonia in Iran. Bacteriological and bio...
Publication date: 2016